Compilation and Communication Strategies for Out-of-Core Programs on Distributed Memory Machines
Authors
Rajesh Bordawekar, Alok Choudhary
Electrical and Computer Engineering Department, 121 Link Hall, Syracuse University, Syracuse, NY 13244
rajesh, [email protected]
URL: http://www.cat.syr.edu/~{rajesh,choudhary}
J. Ramanujam
ECE Dept., Louisiana State University, Baton Rouge, LA 70803
[email protected]
URL: http://www.ee.lsu.edu/jxr/jxr.html
Abstract
It is widely acknowledged that improving parallel I/O performance is critical for the widespread adoption of high performance computing. In this paper, we show that communication in out-of-core distributed memory problems may require both inter-processor communication and file I/O. Thus, in order to improve I/O performance, it is necessary to minimize the I/O costs associated with a communication step. We present three methods for performing communication in out-of-core distributed memory problems. The first method, called the generalized collective communication method, follows a loosely synchronous model; computation and communication phases are clearly separated, and communication requires permutation of data in files. The second method, called receiver-driven in-core communication, considers only the communication required of each in-core data slab individually. The third method, called owner-driven in-core communication, goes one step further and tries to identify the potential future use of data (by the recipients) while it is still in the sender's memory. We describe these methods in detail and present a simple heuristic for choosing among the three methods. We then provide performance results for two out-of-core applications, a two-dimensional FFT code and a two-dimensional elliptic Jacobi solver. Finally, we discuss how the out-of-core and in-core communication methods can be used in virtual memory environments on distributed memory machines.
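The receiver-driven in-core communication method described above can be illustrated with a small, hypothetical sketch (it is not the authors' implementation): each "processor" owns a block of a large array kept "on disk", computation proceeds one in-core slab at a time, and before computing on a slab the receiver fetches only the off-processor boundary values that this particular slab needs, rather than permuting the whole file up front. Here, plain Python lists stand in for the per-processor array files, and the slab size is an assumed parameter.

```python
# Hypothetical sketch of receiver-driven in-core communication for an
# out-of-core 1-D Jacobi sweep, simulated in a single process. Each
# "processor" p owns files[p], a block of the global array held "on disk"
# (a list stands in for the local array file). Only SLAB elements are
# assumed to fit in memory at once.

SLAB = 4  # elements that fit in "memory" at a time (assumed parameter)

def receiver_driven_stencil(files):
    """One nearest-neighbour averaging sweep over per-processor on-disk
    arrays; returns the new per-processor arrays."""
    nprocs = len(files)
    out = []
    for p in range(nprocs):
        n = len(files[p])
        new = []
        for lo in range(0, n, SLAB):          # process one in-core slab at a time
            hi = min(lo + SLAB, n)
            # Receiver-driven step: fetch only the boundary values this
            # slab needs, from the local file or a neighbour's file.
            left = (files[p][lo - 1] if lo > 0
                    else files[p - 1][-1] if p > 0
                    else files[p][lo])        # mirror at the global boundary
            right = (files[p][hi] if hi < n
                     else files[p + 1][0] if p < nprocs - 1
                     else files[p][hi - 1])   # mirror at the global boundary
            slab = [left] + files[p][lo:hi] + [right]
            # Jacobi update on the in-core slab.
            new.extend((slab[i - 1] + slab[i + 1]) / 2.0
                       for i in range(1, len(slab) - 1))
        out.append(new)
    return out
```

The contrast with the generalized collective method is that no global permutation of the files is ever performed; each slab's communication is resolved on demand as the slab becomes in-core.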
Similar resources
Compilation Techniques for Out-of-Core Parallel Computations
The difficulty of handling out-of-core data limits the performance of supercomputers as well as the potential of parallel machines. Since writing an efficient out-of-core version of a program is a difficult task, and virtual memory systems do not perform well on scientific computations, we believe that there is a clear need for a compiler-directed explicit I/O approach for out-of-core computat...
Efficient Compilation of Out-of-Core Data Parallel Programs
Large scale scientific applications, such as the Grand Challenge applications, deal with very large quantities of data. The amount of main memory in distributed memory machines is usually not large enough to solve problems of realistic size. This limitation results in the need for system and application software support to provide efficient parallel I/O for out-of-core programs. This paper describ...
Data Access Reorganizations in Compiling Out-of-Core Data Parallel Programs on Distributed Memory Machines
This paper describes optimization techniques for translating out-of-core programs written in a data parallel language like HPF to message passing node programs with explicit parallel I/O. We first discuss how an out-of-core program can be translated by extending the method used for translating in-core programs. We demonstrate that straightforward extension of in-core compilation techniques does n...
Parallelization of Irregular Codes Including Out-of-Core Data and Index Arrays
This paper describes techniques for implementing irregular out-of-core codes on distributed memory machines. These codes involve data arrays and other data structures that are too large to fit in main memory, so data needs to be stored on disks and fetched during the execution of the program. The efficient use of disk storage is a critical factor that determines the performance of these application...
Synonyms: Parallel Communication Models; Message-Passing Performance Models
Bandwidth-latency models are a group of performance models for parallel programs that model the communication between processes in terms of network bandwidth and latency, allowing fairly precise performance estimates. While originally developed for distributed-memory architectures, these models also apply to machines with non-uniform memory access (NUMA), like the modern multi-...
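The bandwidth-latency family mentioned above includes the classic alpha-beta (Hockney) model, which estimates the time to send an m-byte message as T(m) = alpha + beta * m, where alpha is the per-message latency and beta the per-byte transfer time (the reciprocal of bandwidth). A minimal sketch, with illustrative parameter values only:

```python
# Alpha-beta (Hockney) bandwidth-latency model: predicted time for a
# point-to-point transfer of m bytes is T(m) = alpha + beta * m.
# alpha: per-message startup latency (seconds)
# beta:  per-byte transfer time, i.e. 1 / bandwidth (seconds per byte)

def message_time(m_bytes, alpha, beta):
    """Predicted transfer time for an m-byte message."""
    return alpha + beta * m_bytes
```

For small messages the alpha (latency) term dominates; for large messages the beta (bandwidth) term does, which is why aggregating many small transfers into fewer large ones is a standard optimization.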
Journal: J. Parallel Distrib. Comput.
Volume 38, Issue -
Pages: -
Publication date: 1996